Skip to content

Install NRT by default with topology-updater #2194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

fmuyassarov
Copy link
Member

Since topology-updater relies on the presence of the NRT (NodeResourceTopology) CRDs to function correctly, it should be enabled by default. This prevents accidental pod failures due to a missing but required component. Instead of requiring users to explicitly enable it, we now require them to explicitly disable it if needed.

Since topology-updater relies on the presence of the NRT
(NodeResourceTopology) CRDs to function correctly, it should
be enabled by default. This prevents accidental pod failures
due to a missing but required component. Instead of requiring
users to explicitly enable it, we now require them to
explicitly disable it if needed.

Signed-off-by: Feruzjon Muyassarov <[email protected]>
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fmuyassarov
Once this PR has been reviewed and has the lgtm label, please assign arangogutierrez for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 10, 2025
@k8s-ci-robot k8s-ci-robot requested review from kad and TessaIO July 10, 2025 12:30
@fmuyassarov
Copy link
Member Author

/assign @marquiz

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jul 10, 2025
Copy link

netlify bot commented Jul 10, 2025

Deploy Preview for kubernetes-sigs-nfd ready!

Name Link
🔨 Latest commit 0064da6
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-nfd/deploys/686fb26d9b24a7000843318a
😎 Deploy Preview https://deploy-preview-2194--kubernetes-sigs-nfd.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Copy link
Contributor

@marquiz marquiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think this might be a bad idea because NFD does not "own" the CRD and it's used also elsewhere.

On a quick thought I see two possible main problem scenarios:

  1. If NodeResourceTopology is already present (installed via some other means) NFD deployment would fail, AFAIU
  2. CRDs will be deleted on NFD uninstall which could break some other users of the CRD

Thoughts?

/cc @PiotrProkop @adrianchiris

@fmuyassarov
Copy link
Member Author

fmuyassarov commented Jul 28, 2025

Hmm, I think this might be a bad idea because NFD does not "own" the CRD and it's used also elsewhere.

On a quick thought I see two possible main problem scenarios:

1. If NodeResourceTopology is already present (installed via some other means) NFD deployment would fail, AFAIU

I'm afraid it won't fail and rather skip the installation if CRD is already present on the cluster.

2. CRDs will be deleted on NFD uninstall which could break some other users of the CRD

That's true and actually weird. Because, those CRDs that are "owned" by NFD won't be uninstalled after helm uninstall while the ones installed conditionally (like NRT) will be deleted. I would normally expect the opposite but perhaps I'm missing something in terms of how Helm decides which resources it should delete.

I agree that it's problematic to delete a resource that might be used by other components in the cluster. One way to prevent the deletion of the NRT CRD is to place it under the crds/ directory, similar to how other resources are handled. This approach ensures that the CRD is not removed during a helm uninstall, and more importantly, it gets installed by default during a helm install, reducing the risk of missing the CRD installation.

Moving the CRD to crds/ means that NRT will be installed even if the topology updater isn't deployed. However, I don't believe that's an issue, since we already install other NFD resources that may not be used during the NFD lifecycle.

@marquiz
Copy link
Contributor

marquiz commented Jul 28, 2025

I'm afraid it won't fail and rather skip the installation if CRD is already present on the cluster.

Just tested, it actually does fail:

$ helm -n nfd install nfd nfd/node-feature-discovery  --set topologyUpdater.enable=true --set topologyUpdater.createCRDs=true
Error: INSTALLATION FAILED: Unable to continue with install: CustomResourceDefinition "noderesourcetopologies.topology.node.k8s.io" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "nfd"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "nfd"

@ffromani
Copy link
Contributor

ffromani commented Aug 5, 2025

the problem here is nobody really owns this CRD indeed. My take on this is we should be careful and deliberate:

  1. if the user wants nfd-topology-updater then we should also install the CRD but
  2. there should be a way to opt out and don't install the CRD, but only nfd-topology-updater

this is more complex, but it seems to me it's the way which most conciles userfriendliness and good ecosystem citizenship. I'm open to discuss :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants